Jeollanam-do
- Asia > South Korea > Jeollanam-do > Muan (0.04)
- Asia > Japan > Honshū > Kantō > Kanagawa Prefecture (0.04)
MOGRAS: Human Motion with Grasping in 3D Scenes
Bhosikar, Kunal, Katageri, Siddharth, Madhavaram, Vivek, Han, Kai, Sharma, Charu
Generating realistic full-body motion interacting with objects is critical for applications in robotics, virtual reality, and human-computer interaction. While existing methods can generate full-body motion within 3D scenes, they often lack the fidelity for fine-grained tasks like object grasping. Conversely, methods that generate precise grasping motions typically ignore the surrounding 3D scene. This gap, generating full-body grasping motions that are physically plausible within a 3D scene, remains a significant challenge. To address this, we introduce MOGRAS (Human MOtion with GRAsping in 3D Scenes), a large-scale dataset that bridges this gap. MOGRAS provides pre-grasping full-body walking motions and final grasping poses within richly annotated 3D indoor scenes. We leverage MOGRAS to benchmark existing full-body grasping methods and demonstrate their limitations in scene-aware generation. Furthermore, we propose a simple yet effective method to adapt existing approaches to work seamlessly within 3D scenes. Through extensive quantitative and qualitative experiments, we validate the effectiveness of our dataset and highlight the significant improvements our proposed method achieves, paving the way for more realistic human-scene interactions.
- North America > United States (0.04)
- Asia > South Korea > Jeollanam-do > Muan (0.04)
- Asia > India > Telangana > Hyderabad (0.04)
- Asia > China > Hong Kong (0.04)
Scaling Up Temporal Domain Generalization via Temporal Experts Averaging
Liu, Aoming, Miller, Kevin, Saligrama, Venkatesh, Saenko, Kate, Gong, Boqing, Lim, Ser-Nam, Plummer, Bryan A.
Temporal Domain Generalization (TDG) aims to generalize across temporal distribution shifts, e.g., lexical change over time. Prior work often addresses this by predicting future model weights. However, full model prediction is prohibitively expensive for even reasonably sized models. Thus, recent methods only predict the classifier layer, limiting generalization by failing to adjust other model components. To address this, we propose Temporal Experts Averaging (TEA), a novel and scalable TDG framework that updates the entire model using weight averaging to maximize generalization potential while minimizing computational costs. Our theoretical analysis guides us to two steps that enhance generalization to future domains. First, we create expert models with functional diversity yet parameter similarity by fine-tuning a domain-agnostic base model on individual temporal domains while constraining weight changes. Second, we optimize the bias-variance tradeoff through adaptive averaging coefficients derived from modeling temporal weight trajectories in a principal component subspace. Experts' contributions are based on their projected proximity to future domains. Extensive experiments across 7 TDG benchmarks, 5 models, and 2 TDG settings show that TEA outperforms prior TDG methods by up to 69% while being up to 60x more efficient.
- North America > Trinidad and Tobago > Trinidad > Arima > Arima (0.05)
- Europe > Greece (0.04)
- Asia > South Korea > Jeollanam-do > Muan (0.04)
- Asia > Japan > Honshū > Kantō > Kanagawa Prefecture (0.04)
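The weight-averaging idea in the TEA abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the softmax-over-distance rule for the coefficients, and the toy two-dimensional "weights" are all assumptions made for illustration. The abstract only states that experts contribute according to their projected proximity to future domains.

```python
import math

def average_experts(experts, coeffs):
    """Return the coefficient-weighted average of expert weight vectors."""
    total = sum(coeffs)
    dim = len(experts[0])
    return [
        sum(c * w[i] for c, w in zip(coeffs, experts)) / total
        for i in range(dim)
    ]

def proximity_coeffs(experts, target, temperature=1.0):
    """Hypothetical rule: softmax over negative Euclidean distance to `target`.

    `target` stands in for a projected future point in weight space;
    how TEA actually derives it is not specified in the abstract.
    """
    dists = [math.dist(w, target) for w in experts]
    scores = [math.exp(-d / temperature) for d in dists]
    norm = sum(scores)
    return [s / norm for s in scores]

# Toy example: three experts along a line, future point beyond the last one.
experts = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]]
projected_future = [3.0, 3.0]
coeffs = proximity_coeffs(experts, projected_future)
merged = average_experts(experts, coeffs)
```

In this toy setup the expert closest to the projected future point receives the largest coefficient, so the merged weights lean toward the most recent expert.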
Kernel Quantile Embeddings and Associated Probability Metrics
Naslidnyk, Masha, Chau, Siu Lun, Briol, François-Xavier, Muandet, Krikamol
Embedding probability distributions into reproducing kernel Hilbert spaces (RKHS) has enabled powerful nonparametric methods such as the maximum mean discrepancy (MMD), a statistical distance with strong theoretical and computational properties. At its core, the MMD relies on kernel mean embeddings to represent distributions as mean functions in RKHS. However, it remains unclear if the mean function is the only meaningful RKHS representation. Inspired by generalised quantiles, we introduce the notion of kernel quantile embeddings (KQEs). We then use KQEs to construct a family of distances that: (i) are probability metrics under weaker kernel conditions than MMD; (ii) recover a kernelised form of the sliced Wasserstein distance; and (iii) can be efficiently estimated with near-linear cost. Through hypothesis testing, we show that these distances offer a competitive alternative to MMD and its fast approximations.
- Europe > United Kingdom > England > Greater London > London (0.04)
- Asia > Japan > Honshū > Kantō > Kanagawa Prefecture (0.04)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- (4 more...)
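For readers unfamiliar with the MMD baseline that the kernel quantile embeddings abstract above compares against, the following is a minimal sketch of the standard biased (V-statistic) estimate of squared MMD. It does not implement kernel quantile embeddings. The Gaussian kernel, the bandwidth value, and the restriction to one-dimensional samples are assumptions chosen to keep the example short.

```python
import math

def gaussian_kernel(x, y, bandwidth=1.0):
    """Gaussian (RBF) kernel between two scalars."""
    return math.exp(-((x - y) ** 2) / (2.0 * bandwidth ** 2))

def mmd_squared(xs, ys, bandwidth=1.0):
    """Biased (V-statistic) estimate of squared MMD between two 1-D samples."""
    def mean_k(a, b):
        # Average kernel value over all pairs drawn from a and b.
        total = sum(gaussian_kernel(p, q, bandwidth) for p in a for q in b)
        return total / (len(a) * len(b))
    return mean_k(xs, xs) + mean_k(ys, ys) - 2.0 * mean_k(xs, ys)

near = mmd_squared([0.0, 0.1, 0.2], [0.05, 0.15, 0.25])
far = mmd_squared([0.0, 0.1, 0.2], [5.0, 5.1, 5.2])
```

This estimator evaluates the kernel on every pair of points, so its cost grows quadratically with sample size, which is the cost that fast approximations and near-linear estimators aim to reduce.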
Trustworthy Transfer Learning: A Survey
Transfer learning aims to transfer knowledge or information from a source domain to a relevant target domain. In this paper, we understand transfer learning from the perspectives of knowledge transferability and trustworthiness. This involves two research questions: How is knowledge transferability quantitatively measured and enhanced across domains? Can we trust the transferred knowledge in the transfer learning process? To answer these questions, this paper provides a comprehensive review of trustworthy transfer learning from various aspects, including problem definitions, theoretical analysis, empirical algorithms, and real-world applications. Specifically, we summarize recent theories and algorithms for understanding knowledge transferability under (within-domain) IID and non-IID assumptions. In addition to knowledge transferability, we review the impact of trustworthiness on transfer learning, e.g., whether the transferred knowledge is adversarially robust or algorithmically fair, how to transfer the knowledge under privacy-preserving constraints, etc. Beyond discussing the current advancements, we highlight the open questions and future directions for understanding transfer learning in a reliable and trustworthy manner.
- Asia > Middle East > Jordan (0.04)
- North America > United States > California (0.04)
- Asia > Japan > Honshū > Kantō > Kanagawa Prefecture (0.04)
- (9 more...)
- Overview (1.00)
- Research Report > Experimental Study (0.47)
- Research Report > New Finding (0.34)
- Law (1.00)
- Information Technology > Security & Privacy (1.00)
- Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
- (5 more...)
- Information Technology > Artificial Intelligence > Machine Learning > Transfer Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.67)
- (2 more...)
Machine Learning-Based Channel Prediction for RIS-assisted MIMO Systems With Channel Aging
Ginige, Nipuni, de Sena, Arthur Sousa, Mahmood, Nurul Huda, Rajatheva, Nandana, Latva-aho, Matti
Reconfigurable intelligent surfaces (RISs) have emerged as a promising technology to enhance the performance of sixth-generation (6G) and beyond communication systems. The passive nature of RISs and their large number of reflecting elements pose challenges to the channel estimation process. The associated complexity further escalates when the channel coefficients are fast-varying, as in scenarios with user mobility. In this paper, we propose an extended channel estimation framework for RIS-assisted multiple-input multiple-output (MIMO) systems based on a convolutional neural network (CNN) integrated with an autoregressive (AR) predictor. The implemented framework is designed for identifying the aging pattern and predicting enhanced estimates of the wireless channels in correlated fast-fading environments. Simulation results demonstrate that our proposed CNN-AR approach is robust to channel aging and achieves high estimation accuracy. The results also show that our approach can achieve higher spectral efficiency and lower pilot overhead than traditional methods.
- Europe > Finland > Northern Ostrobothnia > Oulu (0.04)
- North America > United States (0.04)
- Asia > South Korea > Jeollanam-do > Muan (0.04)
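The autoregressive part of the CNN-AR framework described in the abstract above can be illustrated in isolation. The sketch below covers only a first-order AR predictor fitted by least squares on a single complex channel coefficient. The CNN stage is omitted, and the AR order, the function names, and the synthetic channel are assumptions, not details taken from the paper.

```python
import cmath

def fit_ar1(samples):
    """Least-squares AR(1) coefficient a for h[t+1] ~ a * h[t] (complex-valued)."""
    num = sum(nxt * cur.conjugate() for cur, nxt in zip(samples, samples[1:]))
    den = sum(abs(cur) ** 2 for cur in samples[:-1])
    return num / den

def predict_next(samples):
    """Predict the next channel sample from the fitted AR(1) coefficient."""
    return fit_ar1(samples) * samples[-1]

# Synthetic aging channel: amplitude decays and phase rotates at each step.
true_a = 0.9 * cmath.exp(0.1j)
history = [true_a ** t for t in range(6)]
prediction = predict_next(history)
```

On this noise-free synthetic sequence the fitted coefficient matches the generating one up to floating-point error. With noisy estimates a higher AR order or a learned denoiser would typically be needed.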
Discovering the Unknown Knowns: Turning Implicit Knowledge in the Dataset into Explicit Training Examples for Visual Question Answering
Kil, Jihyung, Zhang, Cheng, Xuan, Dong, Chao, Wei-Lun
Visual question answering (VQA) is challenging not only because the model has to handle multi-modal information, but also because it is just so hard to collect sufficient training examples -- there are too many questions one can ask about an image. As a result, a VQA model trained solely on human-annotated examples could easily over-fit specific question styles or image contents that are being asked, leaving the model largely ignorant about the sheer diversity of questions. Existing methods address this issue primarily by introducing an auxiliary task such as visual grounding, cycle consistency, or debiasing. In this paper, we take a drastically different approach. We found that many of the "unknowns" to the learned VQA model are indeed "known" in the dataset implicitly. For instance, questions asking about the same object in different images are likely paraphrases; the number of detected or annotated objects in an image already provides the answer to the "how many" question, even if the question has not been annotated for that image. Building upon these insights, we present a simple data augmentation pipeline SimpleAug to turn this "known" knowledge into training examples for VQA. We show that these augmented examples can notably improve the learned VQA models' performance, not only on the VQA-CP dataset with language prior shifts but also on the VQA v2 dataset without such shifts. Our method further opens up the door to leverage weakly-labeled or unlabeled images in a principled way to enhance VQA models. Our code and data are publicly available at https://github.com/heendung/simpleAUG.
- North America > United States > Ohio > Franklin County > Columbus (0.04)
- Asia > South Korea > Jeollanam-do > Muan (0.04)
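One of the insights in the SimpleAug abstract above is that object counts already answer "how many" questions. A minimal sketch of that single idea follows. It is not the released SimpleAug pipeline: the function name, the question template, and the input format (a flat list of object labels per image) are hypothetical.

```python
from collections import Counter

def how_many_examples(object_labels):
    """Create one ("how many" question, answer) pair per object class in an image."""
    counts = Counter(object_labels)
    return [
        (f"How many instances of '{label}' are in the image?", str(n))
        for label, n in sorted(counts.items())
    ]

# Labels as they might come from detections or annotations for one image.
pairs = how_many_examples(["dog", "cat", "dog"])
```

Each generated pair can be added to the training set for that image even though no annotator asked the question.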
Deep Learning Methods for Proximal Inference via Maximum Moment Restriction
Kompa, Benjamin, Bellamy, David R., Kolokotrones, Thomas, Robins, James M., Beam, Andrew L.
The No Unmeasured Confounding Assumption is widely used to identify causal effects in observational studies. Recent work on proximal inference has provided alternative identification results that succeed even in the presence of unobserved confounders, provided that one has measured a sufficiently rich set of proxy variables satisfying specific structural conditions. However, proximal inference requires solving an ill-posed integral equation. Previous approaches have used a variety of machine learning techniques to estimate a solution to this integral equation, commonly referred to as the bridge function. However, prior work has often been limited by relying on pre-specified kernel functions, which are not data-adaptive and struggle to scale to large datasets. In this work, we introduce a flexible and scalable method based on a deep neural network to estimate causal effects in the presence of unmeasured confounding using proximal inference. Our method achieves state-of-the-art performance on two well-established proximal inference benchmarks. Finally, we provide theoretical consistency guarantees for our method.
- Asia > South Korea > Jeollanam-do > Muan (0.04)
- Asia > Japan > Honshū > Kantō > Kanagawa Prefecture (0.04)
An Efficient Egocentric Regulator for Continuous Targeting Problems of the Underactuated Quadrotor
Lin, Ziying, Dong, Wei, Liu, Sensen, Sheng, Xinjun, Zhu, Xiangyang
Flying robots such as the quadrotor could provide an efficient approach for medical treatment of wild animals or for placing sensors on them. In these applications, continuously targeting the moving animal is a crucial requirement. Due to the underactuated characteristics of the quadrotor and the coupled kinematics with the animal, nonlinear optimal tracking approaches, rather than smooth feedback control, are required. However, with severe nonlinearities, it would be time-consuming to evaluate control inputs, and real-time tracking may not be achieved with generic optimizers onboard. To tackle this problem, a novel egocentric regulation approach with high computational efficiency is proposed in this paper. Specifically, it directly formulates the optimal tracking problem in an egocentric manner with respect to the quadrotor's body coordinates. Meanwhile, the nonlinearities of the system are peeled off through a mapping of the feedback states as well as control inputs between the inertial and body coordinates. In this way, the proposed efficient egocentric regulator only requires solving a quadratic performance objective with linear constraints and then generates control inputs analytically. Comparative simulations and a mimic biological experiment are carried out to verify the effectiveness and computational efficiency. Results demonstrate that the proposed control approach achieves higher and more stable computational efficiency than generic optimizers on different platforms. In particular, on a commonly used onboard computer, our method can compute the control action in approximately 0.3 ms, which is on the order of 350 times faster than generic nonlinear optimizers, establishing a control frequency of around 3000 Hz.